Frameworks for multivariate m-mediods based modeling and classification in Euclidean and general feature spaces
Authors
Abstract
This paper extends the m-mediods based modeling technique to handle multimodal distributions of samples within a pattern. New samples are classified, and anomalies detected, using a novel classification algorithm that can handle patterns with underlying multivariate probability distributions. We propose two frameworks, MMC-ES and MMC-GFS, that make the proposed multivariate m-mediods based modeling and classification approach workable for any feature space with a computable distance metric. The MMC-ES framework is specialized for finite-dimensional features in Euclidean space, whereas MMC-GFS works on any feature space with a computable distance metric. Experimental results on simulated and complex real-life datasets show that the multivariate m-mediods based frameworks are effective and outperform competing modeling and classification techniques, especially when the patterns exhibit multivariate probability density functions. © 2011 Elsevier Ltd. All rights reserved.
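As a rough illustration of the general idea (modeling each pattern with several medoids so that multimodal patterns are covered, using nothing but a computable distance function), the following Python sketch may help. It is a simplified stand-in and not the MMC-ES or MMC-GFS algorithm from the paper: the greedy medoid selection, the nearest-medoid rule, the fixed anomaly `threshold`, and the names `select_medoids` and `classify` are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Optional, Sequence, TypeVar

T = TypeVar("T")
Distance = Callable[[T, T], float]


def select_medoids(samples: Sequence[T], m: int, dist: Distance) -> List[T]:
    """Greedily pick up to m samples as medoids (hypothetical helper).

    Each step adds the sample that most reduces the total distance from
    every sample to its closest medoid chosen so far.
    """
    n = len(samples)
    chosen: List[int] = []
    closest = [float("inf")] * n  # distance to nearest chosen medoid
    for _ in range(min(m, n)):
        best_cost, best_idx = float("inf"), -1
        for c in range(n):
            if c in chosen:
                continue
            cost = sum(min(closest[i], dist(samples[i], samples[c]))
                       for i in range(n))
            if cost < best_cost:
                best_cost, best_idx = cost, c
        chosen.append(best_idx)
        closest = [min(closest[i], dist(samples[i], samples[best_idx]))
                   for i in range(n)]
    return [samples[i] for i in chosen]


def classify(sample: T, models: Dict[str, List[T]], dist: Distance,
             threshold: float) -> Optional[str]:
    """Return the label of the nearest medoid, or None for an anomaly.

    A sample farther than `threshold` (an assumed, fixed cut-off) from
    every medoid is treated as anomalous.
    """
    label, d = min(((lab, dist(sample, med))
                    for lab, meds in models.items() for med in meds),
                   key=lambda pair: pair[1])
    return label if d <= threshold else None


if __name__ == "__main__":
    # Toy 1-D data: pattern "a" has two modes, pattern "b" has one.
    def abs_dist(x: float, y: float) -> float:
        return abs(x - y)

    training = {
        "a": [0.0, 1.0, 2.0, 10.0, 11.0, 12.0],
        "b": [5.0, 6.0, 7.0],
    }
    models = {lab: select_medoids(pts, 2, abs_dist)
              for lab, pts in training.items()}
    for query in (1.5, 11.5, 6.0, 30.0):
        print(query, classify(query, models, abs_dist, threshold=3.0))
```

Because the only requirement on `dist` is that it returns a number for any pair of samples, the same sketch applies to non-vector data (for example, strings with an edit distance), which is the property the abstract attributes to frameworks for general feature spaces.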
Similar resources
Optimal Feature Selection for Data Classification and Clustering: Techniques and Guidelines
In this paper, principles and existing feature selection methods for classifying and clustering data are introduced. To that end, categorizing frameworks for finding selected subsets, namely search-based and non-search-based procedures, as well as evaluation criteria and data mining tasks, are discussed. In the following, a platform is developed as an intermediate step toward developing an intell...
A Novel Scheme for Improving Accuracy of KNN Classification Algorithm Based on the New Weighting Technique and Stepwise Feature Selection
The K nearest neighbor (KNN) algorithm is one of the most frequently used techniques in data mining because of its integrity and performance. Though the KNN algorithm is highly effective in many cases, it has some essential deficiencies that affect its classification accuracy. First, the effectiveness of the algorithm is affected by redundant and irrelevant features. Furthermore, this algori...
A Statistical Study of two Diffusion Processes on Torus and Their Applications
Diffusion processes such as Brownian motion and Ornstein-Uhlenbeck processes are classes of stochastic processes that have been investigated by researchers in various disciplines, including the biological sciences. It is usually assumed that the outcomes of these processes lie in Euclidean spaces. However, some data from physical, chemical and biological phenomena indicate that they cann...
A New Framework for Distributed Multivariate Feature Selection
Feature selection is an important issue in classification. Selecting good features by maximizing relevance to the class label and minimizing redundancy among features improves classification accuracy. However, most current feature selection algorithms work only in a centralized setting. In this paper, we suggest a distributed version of the mRMR featu...
Journal title:
- Pattern Recognition
Volume 45, issue
Pages -
Publication date 2012